Our team is composed of two students attending the Master's Degree in Engineering in Computer Science.
We chose this Mini-Challenge as our Visual Analytics project.
Matplotlib Python Library
developed by J. D. Hunter
link: https://www.matplotlib.org
We used this useful library to start our inspections on the dataset by means of multiple types of representation,
such as line charts, heatmaps and scatterplots.
Microsoft Excel
Tool used for exploring and filtering data, in particular for verifying the correctness of our visualizations.
ColorBrewer
based on the research of Dr. Cynthia Brewer. Built and maintained by Axis Maps.
link: http://colorbrewer2.org
We used this tool to choose colors correctly and to get their hexadecimal codes in order to encode them in our visualizations.
d3js
Copyright 2017 Mike Bostock.
link: https://d3js.org
JavaScript library that we used in our Visual Analytics course and chose for developing the visual analytics system for Mini-Challenge 2.
The duration for this submission is about 55 hours.
Answer 1
Analyzing the data at first glance, we can detect an unexpected behaviour of the sensors. In particular,
we notice that in certain hours some monitors emit two readings for a single chemical
while no reading is provided for another chemical; furthermore, one of the two readings given for that
chemical is very high compared to the other one.
By inspecting the line charts, we discovered that the chemicals involved in this strange behaviour are AGOC-3A and Methylosmolene. From the charts presented above, analyzing the green line associated with Monitor 3, it is clear that whenever a monitor appears to be turned OFF for the chemical Methylosmolene, during the same hour it gives two very different readings of the chemical AGOC-3A, visible as tall, vertical peaks in the first chart. We detected this behaviour in all months.
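The check described above can be sketched in a few lines of Python. This is only an illustration, not the code we ran: the record layout `(monitor, timestamp, chemical, reading)`, the function name `find_anomalies` and the sample values are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# The four chemicals measured by the monitors.
CHEMICALS = {"AGOC-3A", "Appluimonia", "Chlorodinine", "Methylosmolene"}


def find_anomalies(records):
    """Return (monitor, timestamp, duplicated, missing) for each anomalous hour.

    An hour is anomalous for a monitor when at least one chemical has more
    than one reading and at least one chemical has no reading at all.
    """
    counts = defaultdict(Counter)
    for monitor, timestamp, chemical, _reading in records:
        counts[(monitor, timestamp)][chemical] += 1

    anomalies = []
    for (monitor, timestamp), per_chemical in sorted(counts.items()):
        duplicated = sorted(c for c, n in per_chemical.items() if n > 1)
        missing = sorted(CHEMICALS - set(per_chemical))
        if duplicated and missing:
            anomalies.append((monitor, timestamp, duplicated, missing))
    return anomalies


# Made-up sample: at 05:00 Monitor 3 reports AGOC-3A twice and no Methylosmolene.
sample = [
    (3, "2016-04-02 05:00", "AGOC-3A", 0.8),
    (3, "2016-04-02 05:00", "AGOC-3A", 41.2),
    (3, "2016-04-02 05:00", "Appluimonia", 0.3),
    (3, "2016-04-02 05:00", "Chlorodinine", 0.5),
    (3, "2016-04-02 06:00", "AGOC-3A", 0.9),
    (3, "2016-04-02 06:00", "Appluimonia", 0.4),
    (3, "2016-04-02 06:00", "Chlorodinine", 0.6),
    (3, "2016-04-02 06:00", "Methylosmolene", 1.1),
]
anomalies = find_anomalies(sample)
```

Grouping by monitor and hour first keeps the check independent of the order in which the readings appear in the file.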
Answer 2
Now we turn our attention to the chemical release itself by summing the readings given by all monitors
for a given hour, in order to obtain the total chemical emission; we then arrange this derived
data in heatmaps, organized by day on the Y axis and by hour on the X axis.
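The aggregation behind these heatmaps can be sketched as follows. This is a minimal example under assumptions: the record layout `(monitor, datetime, chemical, reading)` and the function name `hourly_totals_grid` are hypothetical, the sample values are made up, and hours without readings are filled with 0.0.

```python
from collections import defaultdict
from datetime import datetime


def hourly_totals_grid(records, chemical):
    """Sum the readings of `chemical` over all monitors, per day and hour.

    Returns (days, grid), where grid[i][h] is the total for days[i] at
    hour h, so that rows are days (Y axis) and columns are hours (X axis).
    """
    totals = defaultdict(float)
    for _monitor, when, chem, reading in records:
        if chem == chemical:
            totals[(when.date(), when.hour)] += reading

    days = sorted({day for day, _hour in totals})
    grid = [[totals.get((day, hour), 0.0) for hour in range(24)] for day in days]
    return days, grid


# Made-up sample: two monitors read the same chemical in the same hour.
sample = [
    (1, datetime(2016, 4, 1, 5), "Methylosmolene", 1.5),
    (2, datetime(2016, 4, 1, 5), "Methylosmolene", 2.0),
    (1, datetime(2016, 4, 2, 7), "Methylosmolene", 0.5),
    (1, datetime(2016, 4, 1, 5), "AGOC-3A", 9.0),
]
days, grid = hourly_totals_grid(sample, "Methylosmolene")
```

A grid in this shape can be passed directly to Matplotlib's `imshow` to draw the heatmap.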
By inspecting these heatmaps, we noticed that the chemicals Appluimonia and Chlorodinine
show almost uniform release patterns, while for the chemicals AGOC-3A and Methylosmolene
we can detect a particular pattern that follows from the result we explained in the first answer.
It is clear that the release trends of the two chemicals are complementary, meaning that when one presents
a series of high measurements the other shows much lower readings, and the other way around.
In addition, we noticed that the inversion of the trends between these two chemicals
happens at two moments during the day:
- around 6:00, when Methylosmolene stops emitting high values while AGOC-3A starts;
- between 18:00 and 21:00, when AGOC-3A stops emitting high values while Methylosmolene starts.
Once again, this behaviour was found in every month, but not on every single day.
Answer 3
In order to identify which factory emits which chemical, we combined the chemical readings
with the wind measurements. For each hour with a missing wind measurement, we took the two
consecutive existing wind measurements whose hours are closest to it and assumed that the
wind direction moves uniformly from the first measurement to the second, which gives the
direction at the hour we are interested in; as for the speed, we simply took the mean of those two
consecutive wind measurements.
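The interpolation described above can be sketched as below. The function name and parameters are hypothetical, and one detail is an assumption of this sketch: the direction is moved along the shorter arc between the two measured directions, so that going from 350 degrees to 20 degrees passes through north.

```python
def interpolate_wind(t, t0, dir0, speed0, t1, dir1, speed1):
    """Estimate (direction_deg, speed) at hour t, with t0 <= t <= t1.

    (t0, dir0, speed0) and (t1, dir1, speed1) are the two consecutive
    existing wind measurements that surround the missing hour t.
    """
    fraction = (t - t0) / (t1 - t0)
    # Signed rotation from dir0 to dir1 along the shorter arc, in [-180, 180).
    delta = (dir1 - dir0 + 180.0) % 360.0 - 180.0
    # The direction moves uniformly from the first measurement to the second.
    direction = (dir0 + fraction * delta) % 360.0
    # The speed is simply the mean of the two measurements.
    speed = (speed0 + speed1) / 2.0
    return direction, speed


# Made-up example: measurements at hours 0 and 4, wind missing at hour 2.
direction, speed = interpolate_wind(2, 0, 350.0, 2.0, 4, 20.0, 4.0)
```

A plain linear interpolation of the two angles would give 185 degrees here, pointing the opposite way, which is why the wrap-around at 360 degrees is handled explicitly.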
To decide which factory a monitor reading in a given hour should be assigned to, we take, for each factory,
the angle between the wind direction and the direction joining the monitor and the factory, and we assign the reading
to the factory for which this angle is minimum.
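A minimal sketch of this assignment rule follows. Several things here are assumptions for illustration: positions are planar `(x, y)` coordinates with y pointing north, the wind direction is expressed as the compass bearing the wind blows from, and the factory names and coordinates in the example are placeholders rather than the real map positions.

```python
import math


def bearing_deg(from_xy, to_xy):
    """Compass bearing (0 = north, clockwise) from one point to another."""
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def angular_gap(a, b):
    """Smallest absolute difference between two bearings, in [0, 180]."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def attribute_reading(monitor_xy, wind_from_deg, factories):
    """Return the name of the factory whose bearing from the monitor is
    closest to the direction the wind is blowing from."""
    return min(
        factories,
        key=lambda name: angular_gap(
            bearing_deg(monitor_xy, factories[name]), wind_from_deg
        ),
    )


# Placeholder layout: factory "A" lies north of the monitor, "B" lies east.
factories = {"A": (0.0, 10.0), "B": (10.0, 0.0)}
chosen = attribute_reading((0.0, 0.0), 80.0, factories)
```

With wind coming from 80 degrees, factory "B" (bearing 90 degrees) is selected because its angle to the wind direction is 10 degrees, against 80 degrees for "A".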
Taking the wind speed into account, we computed the hour at which that amount of chemical was released by
taking the hour of the reading, subtracting from it the time needed to travel from the factory to the monitor
at the wind speed and, finally, rounding this value to a whole hour.
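The back-dating step can be sketched as follows. The function name is hypothetical, and the units (distance in metres, wind speed in metres per second), the rounding to the nearest hour and the handling of calm wind are assumptions of this example.

```python
from datetime import datetime, timedelta


def estimated_release_hour(reading_time, distance_m, wind_speed_ms):
    """Estimate the hour at which a chemical read at `reading_time` left the factory.

    Returns None when the wind speed is not positive, since the travel
    time cannot be computed in that case.
    """
    if wind_speed_ms <= 0:
        return None
    travel = timedelta(seconds=distance_m / wind_speed_ms)
    released = reading_time - travel
    # Round to the nearest whole hour: shift by 30 minutes, then truncate.
    shifted = released + timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)


# Made-up example: 9 km at 2 m/s takes 1 h 15 min, so a 06:00 reading
# corresponds to a release at about 04:45, which rounds to 05:00.
release = estimated_release_hour(datetime(2016, 4, 2, 6, 0), 9000.0, 2.0)
```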
The attribution process is illustrated in the image below, where factories are represented by red crosses and monitors
by black circles.
After that, we chose to represent the derived data by month, through four heatmaps, each depicting the factories' emissions of a given chemical. In this way we can easily detect which factory emits the most of a given chemical and also discover any uncommon release pattern.
From the analysis of the heatmaps presented above, we obtain the following results:
- for Appluimonia, Kasios and Roadrunner emit the most, immediately followed
by Indigo; Radiance's emissions are much lower than those of the other factories.
- for Chlorodinine, the trend is similar to that of Appluimonia; in particular, most
of the emissions come from Kasios, Roadrunner and Indigo, while Radiance gives a very small contribution.
- AGOC-3A and Methylosmolene have to be analyzed together, because their releases are related
through a complementary relationship, as we noticed in the answers above. In particular, this
phenomenon is reproduced in the emissions of Roadrunner, Indigo and Radiance, while Kasios
has much steadier emissions of these two chemicals.
From a careful reading of the chemicals documentation, we understand that Methylosmolene
is much more toxic than AGOC-3A; for this reason we suspect that those three factories
tampered with the sensors in order to keep Methylosmolene emissions low by transforming these emissions
into AGOC-3A emissions; this could be the reason why we have two readings of AGOC-3A
in a single hour and none for Methylosmolene. Furthermore, we notice that the hours at which
this complementary behaviour switches could almost coincide with the beginning and end of
the factory employees' working hours, possibly implying that some employees are in charge of tampering
with the sensors before starting work and restoring them before going back home.